9 research outputs found

    Unpacking Large Language Models with Conceptual Consistency

    If a Large Language Model (LLM) answers "yes" to the question "Are mountains tall?" then does it know what a mountain is? Can you rely on it responding correctly or incorrectly to other questions about mountains? The success of Large Language Models (LLMs) indicates they are increasingly able to answer queries like these accurately, but that ability does not necessarily imply a general understanding of concepts relevant to the anchor query. We propose conceptual consistency to measure an LLM's understanding of relevant concepts. This novel metric measures how well a model can be characterized by finding out how consistent its responses to queries about conceptually relevant background knowledge are. To compute it, we extract background knowledge by traversing paths between concepts in a knowledge base and then try to predict the model's response to the anchor query from the background knowledge. We investigate the performance of current LLMs in a commonsense reasoning setting using the CSQA dataset and the ConceptNet knowledge base. While conceptual consistency, like other metrics, does increase with the scale of the LLM used, we find that popular models do not necessarily have high conceptual consistency. Our analysis also shows significant variation in conceptual consistency across different kinds of relations, concepts, and prompts. This serves as a step toward building models that humans can apply a theory of mind to, and thus interact with intuitively.

    Learning Compositional Visual Concepts with Mutual Consistency

    Compositionality of semantic concepts in image synthesis and analysis is appealing as it can help in decomposing known and generatively recomposing unknown data. For instance, we may learn concepts of changing illumination, geometry or albedo of a scene, and try to recombine them to generate physically meaningful, but unseen data for training and testing. In practice, however, we often do not have samples from the joint concept space available: we may have data on illumination change in one data set and on geometric change in another one without complete overlap. We pose the following question: How can we learn two or more concepts jointly from different data sets with mutual consistency where we do not have samples from the full joint space? We present a novel answer in this paper based on cyclic consistency over multiple concepts, represented individually by generative adversarial networks (GANs). Our method, ConceptGAN, can be understood as a drop-in for data augmentation to improve resilience for real-world applications. Qualitative and quantitative evaluations demonstrate its efficacy in generating semantically meaningful images, as well as one-shot face verification as an example application. Comment: 10 pages, 8 figures, 4 tables, CVPR 201

    Confidence Calibration for Systems with Cascaded Predictive Modules

    Existing conformal prediction algorithms estimate prediction intervals at target confidence levels to characterize the performance of a regression model on new test samples. However, in an autonomous system consisting of multiple modules, prediction intervals constructed for individual modules do not account for uncertainty propagating across modules and thus cannot provide reliable predictions of system behavior. We address this limitation and present novel solutions based on conformal prediction to provide prediction intervals calibrated for a predictive system consisting of cascaded modules (e.g., an upstream feature extraction module and a downstream regression module). Our key idea is to leverage module-level validation data to characterize the system-level error distribution without direct access to end-to-end validation data. We provide theoretical justification and empirical results to demonstrate the effectiveness of the proposed solutions. In comparison to prediction intervals calibrated for individual modules, our solutions generate improved intervals with more accurate performance guarantees for system predictions, which are demonstrated on both synthetic systems and real-world systems performing overlap prediction for indoor navigation using the Matterport3D dataset.
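
    For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of standard split conformal prediction for a single regression model. It is not the cascaded-module method the paper proposes, which is not specified in this listing. The function names (`conformal_quantile`, `split_conformal_interval`) and the `predict` callable are hypothetical, chosen only for illustration, and the sketch assumes calibration data that is exchangeable with the test point.

```python
import math


def conformal_quantile(residuals, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest residual.

    This is the finite-sample-corrected quantile used in split conformal
    prediction. If the calibration set is too small for the requested
    level, the interval is unbounded, so math.inf is returned.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(residuals)[k - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Symmetric prediction interval around predict(x_new).

    `predict` is any already-fitted regression function (an assumption of
    this sketch); calib_x / calib_y are held-out calibration data that
    were not used to fit it.
    """
    # Nonconformity score: absolute residual on each calibration point.
    residuals = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(residuals, alpha)
    center = predict(x_new)
    return center - q, center + q
```

    Under the exchangeability assumption, the returned interval covers the true response with probability at least 1 - alpha for this one model. The abstract's point is that applying such a guarantee to each module separately does not yield a guarantee for the cascaded system as a whole.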

    Ultrasmall inorganic cages directed by surfactant micelles

    Functional silica nanoparticles have become highly relevant materials in the fields of biology and medicine. Ultrasmall fluorescent silica nanoparticles developed in our group (Cdots) have now reached phase 2 of clinical trials for cancer diagnostics. Nevertheless, modern nanomedicine techniques, with their increasing complexity, still demand more efficient and multifunctional tools for advanced applications such as theranostics. To this end, important developments have been made in order for these nanoparticles to achieve their full potential, including chemical modification of their matrix to improve their optical properties, and new synthetic strategies for multifunctional nanoparticles via a surface modification approach with various functional groups. In parallel, new alternative particle geometries have been investigated for targeted drug delivery applications. In this contribution, we will review some of the recent progress made in our group that ultimately led to the discovery of highly symmetrical dodecahedral silica nanocages, or ‘silicages’ [1]. Ultrasmall (< 10 nm) silica nanoparticles with tunable geometries can be obtained through their templating with surfactant micelles. The self-assembly of silica clusters on these micelles gives rise to unique and well-defined structures. The dodecahedral cage structure in particular is of great fundamental importance. It is the simplest of a set of Voronoi polyhedra suggested to form the smallest structural units of multiple forms of mesoporous silica, yet such highly symmetrical silica cages had never been isolated before. In order to resolve the actual structure of these ultrasmall objects, single-particle 3D reconstruction from tens of thousands of cryo-electron microscopy images was performed using a custom-built ‘Hetero’ machine learning algorithm. We will finally show that cage formation is not limited to silica, but has been observed for other materials including metals and transition metal oxides. The chemical and practical value of this polyhedral structure may prove immense. Given the versatility of silica surface chemistry, one can readily conceive of cage derivatives of many kinds, which may exhibit unusual properties and be useful in applications ranging from catalysis to drug delivery. For example, given recent success in the clinical translation of ultrasmall fluorescent silica nanoparticles with similar particle sizes and surface properties to these cages, one can envisage a range of new diagnostic and therapeutic probes with drugs hidden inside the cages. Reference: [1] K. Ma, Y. Gong, T. Aubert, M. Z. Turker, T. Kao, P. C. Doerschuk, U. Wiesner, Nature 2018, DOI: 10.1038/s41586-018-0221-0

    Self-assembly of highly symmetrical, ultrasmall inorganic cages directed by surfactant micelles

    Nanometre-sized objects with highly symmetrical, cage-like polyhedral shapes, often with icosahedral symmetry, have recently been assembled from DNA (1-3), RNA (4) or proteins (5,6) for applications in biology and medicine. These achievements relied on advances in the development of programmable self-assembling biological materials (7-10), and on rapidly developing techniques for generating three-dimensional (3D) reconstructions from cryo-electron microscopy images of single particles, which provide high-resolution structural characterization of biological complexes (11-13). Such single-particle 3D reconstruction approaches have not yet been successfully applied to the identification of synthetic inorganic nanomaterials with highly symmetrical cage-like shapes. Here, however, using a combination of cryo-electron microscopy and single-particle 3D reconstruction, we suggest the existence of isolated ultrasmall (less than 10 nm) silica cages ('silicages') with dodecahedral structure. We propose that such highly symmetrical, self-assembled cages form through the arrangement of primary silica clusters in aqueous solutions on the surface of oppositely charged surfactant micelles. This discovery paves the way for nanoscale cages made from silica and other inorganic materials to be used as building blocks for a wide range of advanced functional-materials applications.

    COMPUTATIONAL IMAGE UNDERSTANDING INCORPORATING PHYSICS-BASED MODELING AND EMPIRICAL LEARNING FOR REAL-WORLD APPLICATIONS

    Challenging interdisciplinary applications inspire new methodological developments in data understanding. Two somewhat disjoint communities provide current solutions to data understanding. Statistical inference approaches based on abstract models allow incorporation of physics priors and parametric uncertainty. But to provide accurate models for complicated real-world data, one is often challenged by the curse of dimensionality. Alternatively, machine learning, especially the deep learning community, provides empirical descriptions of large complicated datasets. However, little prior knowledge is incorporated in the current design of deep neural networks, and such methods are often challenged by problems including data scarcity and limited transferability of the models. This dissertation includes methodological development in image understanding from each of the two perspectives: (1) Using statistical inference based on analytical models, 3-D spatial structure and temporal dynamics of nanoscale particles were reconstructed directly from large sets of cryo-electron microscopy data. With a statistical framework incorporating the continuous heterogeneity among the imaged particles, a generative mechanical model was developed to provide sparse and analytical parametrization of the stochastic description of particle structure. This work contributes a systematic way to incorporate a fourth (temporal) dimension to the concept of 3D reconstruction. (2) Via deep neural network-based machine learning approaches, the problem of concept learning in computer vision was investigated. Motivated by the challenge of data scarcity, a deep generative model-based framework, ConceptGAN, was developed to decompose data into transferable and composable semantic concepts and generatively recompose physically meaningful but unseen data, without complete training data over the joint latent space. It contributes a smart data augmentation technique which provides informative augmentation to improve the resilience of real-world applications. Finally, this dissertation concludes with a discussion on potential future research directions, in particular, on how methodological ideas from the two perspectives of physics-based modeling and deep learning can be fused to provide hybrid solutions that incorporate the strengths of both components, especially targeting real-world challenges including resilience, robustness, transferability and interpretability of the solutions.